2,099 research outputs found

    Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion

    Foreign accent conversion (FAC) is a special application of voice conversion (VC) which aims to convert the accented speech of a non-native speaker to native-sounding speech with the same speaker identity. FAC is difficult since the native speech from the desired non-native speaker to be used as the training target is impossible to collect. In this work, we evaluate three recently proposed methods for ground-truth-free FAC, all of which aim to harness the power of sequence-to-sequence (seq2seq) and non-parallel VC models to properly convert the accent and control the speaker identity. Our experimental evaluation results show that no single method was significantly better than the others in all evaluation axes, which is in contrast to conclusions drawn in previous studies. We also explain the effectiveness of these methods with the training input and output of the seq2seq model and examine the design choice of the non-parallel VC model, and show that intelligibility measures such as word error rates do not correlate well with subjective accentedness. Finally, our implementation is open-sourced to promote reproducible research and help future researchers improve upon the compared systems.
    Comment: Accepted to the 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). Demo page: https://unilight.github.io/Publication-Demos/publications/fac-evaluate. Code: https://github.com/unilight/seq2seq-v

    Using Online Games To Teach Personal Finance Concepts

    This case study explores the use of online games to teach personal finance concepts at the college level. A number of free online games targeting such topics as budgeting and saving, risk and return, consumer credit, financial services, and investments were introduced to the experimental group as homework assignments. Statistical results indicate that integrating online games into coursework significantly enhanced student learning outcomes. We suggest extending our successful experience to groups of people who need financial knowledge the most.

    Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders

    An effective approach to non-parallel voice conversion (VC) is to utilize deep neural networks (DNNs), specifically variational auto encoders (VAEs), to model the latent structure of speech in an unsupervised manner. A previous study has confirmed the effectiveness of VAE using the STRAIGHT spectra for VC. However, VAE using other types of spectral features such as mel-cepstral coefficients (MCCs), which are related to human perception and have been widely used in VC, has not been properly investigated. Instead of using one specific type of spectral feature, it is expected that VAE may benefit from using multiple types of spectral features simultaneously, thereby improving the capability of VAE for VC. To this end, we propose a novel VAE framework (called cross-domain VAE, CDVAE) for VC. Specifically, the proposed framework utilizes both STRAIGHT spectra and MCCs by explicitly regularizing multiple objectives in order to constrain the behavior of the learned encoder and decoder. Experimental results demonstrate that the proposed CDVAE framework outperforms the conventional VAE framework in terms of subjective tests.
    Comment: Accepted to ISCSLP 201

    Learning satisfaction of undergraduates in single-sex-dominated academic fields in Taiwan

    The present study investigated relationships between undergraduates’ learning satisfaction, academic identity, self-esteem and feelings of depression and loneliness in Taiwan. Data were from a national survey in Taiwan. Participants were 15,706 third-year undergraduates (8719 female, 6987 male). The results showed that, after controlling for undergraduates’ academic performance and attitudes toward university and department, (1) learning satisfaction of females in male-dominant fields was negatively correlated with their feelings of depression, (2) learning satisfaction of males in female-dominant fields was positively correlated with their academic identity and self-esteem, and (3) learning satisfaction of undergraduates in non-dominated fields was positively correlated with their academic identity and self-esteem but also negatively correlated with their feelings of depression.

    Document Recommendation in Organizations with Personal Folders

    In organizations, knowledge workers usually have their own personal folders that store and organize needed codified knowledge (textual documents) in a taxonomy. In such personal folder environments, providing knowledge workers with needed knowledge from other workers’ folders is important to facilitate knowledge sharing. This work adopts recommendation techniques to provide knowledge workers with needed textual documents from other workers’ folders. Experiments are conducted to verify the performance of various methods using data collected from a research institute laboratory. The results show that the CBF approach outperforms other methods.

    Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition

    Research on automatic speech recognition (ASR) systems for electrolaryngeal speakers has been relatively unexplored due to small datasets. When training data is lacking in ASR, a large-scale pretraining and fine-tuning framework is often sufficient to achieve high recognition rates; however, in electrolaryngeal speech, the domain shift between the pretraining and fine-tuning data is too large to overcome, limiting the maximum improvement of recognition rates. To resolve this, we propose an intermediate fine-tuning step that uses imperfect synthetic speech to close the domain shift gap between the pretraining and target data. Despite the imperfect synthetic data, we show the effectiveness of this on electrolaryngeal speech datasets, with improvements of 6.1% over the baseline that did not use imperfect synthetic speech. Results show how the intermediate fine-tuning stage focuses on learning the high-level inherent features of the imperfect synthetic data rather than the low-level features such as intelligibility.
    Comment: Submitted to ICASSP 202

    Unmasking stem-specific broadly neutralizing epitopes by abolishing N-linked glycosylation sites for vaccine design

    Targeting highly conserved HA stem regions has been proposed as a useful strategy for designing universal influenza vaccines. The influenza virus HA stem region, consisting of an HA1 N-terminal part and the full HA2 part, contains several potential sites for the addition of N-glycans. We expressed a series of recombinant HA (rHA) mutant proteins with deleted N-linked glycosylation sites in the HA1-stem and HA2-stem regions of H5N1 and pH1N1 viruses. Unmasking N-glycans in the HA2-stem region (rH5HA N484A and rH1HA N503A) did not affect the trimeric structure of HA. Immunizations using rH5HA N484A and rH1HA N503A elicited more potent neutralizing antibody titers against homologous, heterologous and heterosubtypic viruses. Unmasking the HA2-stem N-glycans of rH5HA N484A induced higher levels of stem-specific CR6261-like and FI6v3-like antibodies, improved the ability of stem-specific anti-fusion antibodies, enhanced H5 stem helix A epitope-specific B and T cell responses in splenocytes, and provided better protection against both homologous and heterosubtypic virus challenges. These findings suggest that HA2-stem N-glycan unmasking holds potential as a useful design strategy for developing more broadly protective influenza vaccines.